I have migrated to Xet storage, and today I tried to test whether Xet is really working.
My test is simple: generate an all-ones (int) array using numpy and upload it to Hugging Face.
import numpy as np
a = np.ones(10000000,dtype=int)
np.save("./one.npy", a)
And upload it.
pip install -U "huggingface_hub[cli,hf_xet]"
huggingface-cli.exe upload lyk/XetTest . --repo-type=dataset
Start hashing 1 files.
Finished hashing 1 files.
Uploading files using Xet Storage..
It shows that I am using Xet, but in the end the LFS storage usage is 40MB, just as large as the raw file, so there was no deduplication.
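To make my expectation concrete, here is a rough sketch of why I thought an all-ones file should deduplicate well. It assumes fixed-size 64 KiB chunks, which is only a simplification for illustration (as far as I understand, Xet's real chunking is content-defined), and `count_unique_chunks` is a hypothetical helper of mine, not part of huggingface_hub. It also uses a smaller payload and skips the .npy header.

```python
import hashlib

CHUNK_SIZE = 64 * 1024  # assumed chunk size, for illustration only


def count_unique_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> tuple[int, int]:
    """Return (total_chunks, unique_chunks) using fixed-size chunking."""
    seen = set()
    total = 0
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        seen.add(hashlib.sha256(chunk).digest())
        total += 1
    return total, len(seen)


# Raw bytes equivalent to 1,000,000 little-endian int32 ones (no .npy header).
payload = (1).to_bytes(4, "little") * 1_000_000
total, unique = count_unique_chunks(payload)
print(f"{total} chunks, {unique} unique")  # 62 chunks, 2 unique
```

Under these assumptions, every full chunk is identical and only the shorter final chunk differs, so I expected the stored size to be far below the raw size.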
Well, maybe it only deduplicates across commit history. So I generated a file twice as large:
import numpy as np
a = np.ones(20000000,dtype=int)
np.save("./one.npy", a)
Then I uploaded it and got 120MB of LFS storage usage.
And during the whole process, the progress bar in the terminal showed that I uploaded the whole files (40MB and 80MB), although Xet is enabled.
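For reference, the 120MB figure is simply the sum of the two uploads, which is why I think nothing was deduplicated between them. A quick check of the arithmetic, assuming `dtype=int` means 4-byte int32 here (what I get on Windows with NumPy 1.x) and ignoring the small .npy header:

```python
ITEM_SIZE = 4  # bytes per element; assumes dtype=int is int32 on this machine

first = 10_000_000 * ITEM_SIZE   # first upload
second = 2 * first               # second upload, twice as large
total_mb = (first + second) // 10**6

print(first // 10**6, second // 10**6, total_mb)  # 40 80 120
```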
I don’t know why Xet does not work. Is anything wrong?